4 ways to fix 'tech neck,' according to a physical therapist
Strengthening can help if you're staring at your phone too much, and you don't need a ton of equipment to fix your neck. If you're here seeking relief from tech neck, the forward head posture associated with the use of personal devices, we've got good news and bad news. The good news is you've come to the right place; the bad news is you're probably contributing to it right now.
- Asia > Middle East > Jordan (0.07)
- North America > United States > New York > Albany County > Albany (0.05)
Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective
Wang, Siwei, Shen, Yifei, Sun, Haoran, Feng, Shi, Teng, Shang-Hua, Dong, Li, Hao, Yaru, Chen, Wei
Recent reinforcement learning (RL) methods have substantially enhanced the planning capabilities of Large Language Models (LLMs), yet the theoretical basis for their effectiveness remains elusive. In this work, we investigate RL's benefits and limitations through a tractable graph-based abstraction, focusing on policy gradient (PG) and Q-learning methods. Our theoretical analyses reveal that supervised fine-tuning (SFT) may introduce co-occurrence-based spurious solutions, whereas RL achieves correct planning primarily through exploration, underscoring exploration's role in enabling better generalization. However, we also show that PG suffers from diversity collapse, where output diversity decreases during training and persists even after perfect accuracy is attained. By contrast, Q-learning provides two key advantages: off-policy learning and diversity preservation at convergence. We further demonstrate that careful reward design is necessary to prevent reward hacking in Q-learning. Finally, applying our framework to the real-world planning benchmark Blocksworld, we confirm that these behaviors manifest in practice.

Planning is a fundamental cognitive construct that underpins human intelligence, shaping our ability to organize tasks, coordinate activities, and formulate complex solutions such as mathematical proofs. It enables humans to decompose complex goals into manageable steps, anticipate potential challenges, and maintain coherence during problem solving. Similarly, planning plays a pivotal role in state-of-the-art Large Language Models (LLMs), enhancing their ability to address structured and long-horizon tasks with greater accuracy and reliability. Early generations of LLMs primarily relied on next-token prediction and passive statistical learning, which limited their planning capabilities to short-horizon, reactive responses.
- North America > United States > California (0.14)
- Asia (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
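The diversity-collapse behavior of policy gradient described above can be sketched outside the paper's graph setting with a toy REINFORCE run. The bandit task, action count, and learning rate below are illustrative assumptions, not the authors' setup: two actions are equally correct, and the softmax policy typically concentrates on one of them even after accuracy is essentially perfect.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 6                      # actions; actions 0 and 1 are both "correct plans" (assumed toy task)
correct = {0, 1}
logits = np.zeros(K)
lr = 0.5

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for step in range(3000):
    p = softmax(logits)
    a = rng.choice(K, p=p)
    r = 1.0 if a in correct else 0.0
    # REINFORCE: grad of log pi(a) w.r.t. logits is onehot(a) - p
    grad = -p.copy()
    grad[a] += 1.0
    logits += lr * r * grad

p = softmax(logits)
accuracy = p[0] + p[1]                     # probability mass on correct actions -> near 1
ratio = min(p[0], p[1]) / max(p[0], p[1])  # diversity among the two correct actions;
                                           # typically shrinks (rich-get-richer collapse)
```

Both correct actions receive identical rewards, so the collapse is driven purely by sampling noise amplified through the softmax coupling, which is the mechanism the abstract attributes to PG.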
The Decoupled Risk Landscape in Performative Prediction
Sanguino, Javier, Kehrenberg, Thomas, Lozano, Jose A., Quadrianto, Novi
Performative Prediction addresses scenarios where deploying a model induces a distribution shift in the input data, such as individuals modifying their features and reapplying for a bank loan after rejection. The literature has largely taken a theoretical perspective, giving mathematical guarantees for convergence (either to the stable or the optimal point). We believe that visualization of the loss landscape can complement these theoretical advances with practical insights. Therefore, (1) we introduce a simple decoupled risk visualization method inspired by the two-step process that performative prediction entails. Our approach visualizes the risk landscape with respect to two parameter vectors: model parameters and data parameters. We use this method to propose new properties of the points of interest and to examine how existing algorithms traverse the risk landscape and perform under more realistic conditions, including strategic classification with non-linear models. (2) Building on this decoupled risk visualization, we introduce a novel setting - extended Performative Prediction - which captures scenarios where the distribution reacts to a model different from the decision-making one, reflecting the reality that agents often lack full access to the deployed model.
- Europe > United Kingdom > England > East Sussex > Brighton (0.04)
- Europe > Spain > Basque Country > Biscay Province > Bilbao (0.04)
- Asia > Middle East > Israel > Southern District > Eilat (0.04)
- Workflow (0.66)
- Research Report (0.64)
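As a loose illustration of the decoupled view (not the authors' code), one can grid a closed-form risk over a model-parameter axis and a data-parameter axis, recovering the usual performative risk on the diagonal where the data reacts to the deployed model itself. The Gaussian mean-shift response model and all constants below are assumptions:

```python
import numpy as np

mu, eps = 1.0, 0.5                  # base data mean and performativity strength (assumed)
thetas = np.linspace(-1, 4, 501)    # model-parameter axis
phis = np.linspace(-1, 4, 501)      # data-parameter axis

# Decoupled risk: data drawn from N(mu + eps*phi, 1), squared loss for model theta.
# Closed form: R(theta, phi) = (theta - mu - eps*phi)^2 + 1
T, P = np.meshgrid(thetas, phis, indexing="ij")
R = (T - mu - eps * P) ** 2 + 1.0

# The classical performative risk lives on the diagonal phi = theta.
perf_risk = np.diag(R)
theta_opt = thetas[np.argmin(perf_risk)]   # performatively optimal point on the grid
# A performatively stable point solves theta = mu + eps*theta:
theta_stable = mu / (1 - eps)
```

Plotting `R` as a heatmap with the diagonal overlaid gives exactly the kind of two-axis landscape the abstract proposes; in this simple quadratic model the stable and optimal points happen to coincide.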
Feedforward Learning of Mixture Models
Matthew Lawlor, Steven W. Zucker
We develop a biologically-plausible learning rule that provably converges to the class means of general mixture models. This rule generalizes the classical BCM neural rule within a tensor framework, substantially increasing the generality of the learning problem it solves. It achieves this by incorporating triplets of samples from the mixtures, which provides a novel information-processing interpretation of spike-timing-dependent plasticity (STDP). We provide both proofs of convergence and a close fit to experimental data on STDP.
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- North America > United States > New York (0.04)
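The triplet idea can be loosely illustrated (this is not the paper's BCM-style rule) by noting that third-order sample products carry class-mean information. A noiseless caricature, with assumed class means and random weights, where each sample sits exactly at its class mean:

```python
import numpy as np

rng = np.random.default_rng(1)
# Two classes with fixed means; samples sit exactly at their class mean
# (a noiseless caricature of a mixture, only to show what triple products encode).
means = np.array([[1.0, 0.0], [0.0, 2.0]])
labels = rng.integers(0, 2, size=5000)
X = means[labels]

# Empirical third-moment tensor E[x (x) x (x) x]
M3 = np.einsum("ni,nj,nk->ijk", X, X, X) / len(X)

# With noiseless samples this equals the weight-averaged rank-1 tensors of the class means,
# so the class means are recoverable from third-order statistics.
w = np.bincount(labels) / len(labels)
M3_expected = sum(
    w[c] * np.einsum("i,j,k->ijk", means[c], means[c], means[c]) for c in range(2)
)
```

With noise, cross terms appear and recovering the means takes the kind of machinery the paper develops; the point of the sketch is only that triplets, unlike pairs, can separate mixture components.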
Tight Lower Bounds and Improved Convergence in Performative Prediction
Khorsandi, Pedram, Gupta, Rushil, Mofakhami, Mehrnaz, Lacoste-Julien, Simon, Gidel, Gauthier
Performative prediction is a framework accounting for the shift in the data distribution induced by the prediction of a model deployed in the real world. Ensuring rapid convergence to a stable solution where the data distribution remains the same after the model deployment is crucial, especially in evolving environments. This paper extends the Repeated Risk Minimization (RRM) framework by utilizing historical datasets from previous retraining snapshots, yielding a class of algorithms that we call Affine Risk Minimizers and enabling convergence to a performatively stable point for a broader class of problems. We introduce a new upper bound for methods that use only the final iteration of the dataset and prove for the first time the tightness of both this new bound and the previous existing bounds within the same regime. We also prove that utilizing historical datasets can surpass the lower bound for last iterate RRM, and empirically observe faster convergence to the stable point on various performative prediction benchmarks. We offer at the same time the first lower bound analysis for RRM within the class of Affine Risk Minimizers, quantifying the potential improvements in convergence speed that could be achieved with other variants in our framework.
- North America > United States > California (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > Canada > Quebec (0.04)
- (2 more...)
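The basic last-iterate RRM recursion the paper builds on can be sketched on a toy performative mean-estimation problem. The mean-shift response model and constants below are illustrative assumptions: deploying theta shifts the population mean to mu + eps*theta, and each refit sets theta to the new mean.

```python
mu, eps = 1.0, 0.5   # assumed toy values; eps < 1 makes the update a contraction

theta = 0.0
history = [theta]
for _ in range(30):
    # Deploying theta shifts the population mean to mu + eps*theta;
    # repeated risk minimization refits the squared-loss minimizer (the mean).
    theta = mu + eps * theta
    history.append(theta)

theta_stable = mu / (1 - eps)   # fixed point of the update
```

Each refit contracts toward the stable point at linear rate eps, which is the kind of last-iterate behavior the paper's upper and lower bounds quantify; Affine Risk Minimizers additionally reuse earlier snapshots to accelerate this contraction.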
On the Conditions for Domain Stability for Machine Learning: a Mathematical Approach
This work proposes a mathematical approach that (re)defines a property of Machine Learning models named stability and determines sufficient conditions to validate it. Machine Learning models are represented as functions, and the characteristics in scope depend upon the domain of the function, which allows us to adopt the theory of topological and metric spaces as a basis. Finally, this work provides some equivalences useful for proving and testing stability in Machine Learning models. The results suggest that whenever stability is aligned with the notion of function smoothness, the stability of Machine Learning models primarily depends upon certain topological, measurable properties of the classification sets within the ML model domain.
- North America > United States > New York > New York County > New York City (0.05)
- Europe > France (0.04)
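In the spirit of tying stability to function smoothness, a small empirical probe can estimate how sharply a model, viewed as a function on its domain, responds to nearby inputs. The model and sampling scheme below are hypothetical stand-ins, not the paper's formalism:

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x):
    # Hypothetical stand-in for a trained ML model viewed as a function on its domain.
    return np.tanh(2.0 * x)

# Empirical smoothness probe: |f(x) - f(y)| / |x - y| over sampled nearby pairs.
xs = rng.uniform(-3, 3, size=2000)
ys = xs + rng.uniform(1e-4, 1e-3, size=2000)   # small positive perturbations
ratios = np.abs(model(xs) - model(ys)) / np.abs(xs - ys)
L_hat = ratios.max()   # crude Lipschitz-constant estimate on this sample
```

For `tanh(2x)` the true Lipschitz constant is 2, so the estimate approaches but never exceeds it; a bounded estimate of this kind is one concrete way to test the smoothness-aligned stability notion on a model's domain.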
Reviews: A Bridging Framework for Model Optimization and Deep Propagation
Paper summary: The paper proposes a learning-based hybrid proximal gradient method for composite minimization problems. Each iteration is divided into two modules: the learning module performs data-fidelity minimization with certain network-based priors; the optimization module then generates strictly convergent propagations by applying proximal-gradient feedback to the output of the learning module. The generated iterates are shown to form a Cauchy sequence converging to the critical points of the original objective. The method is applied to image restoration tasks and its performance evaluated. Comments: The core idea is to develop a learning-based optimization module that incorporates domain knowledge into the conventional proximal gradient descent procedure.
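A minimal sketch of the two-module pattern the review describes, with a trivial shrinkage map standing in for the network-based prior and soft-thresholding as the proximal operator. The problem instance, the stand-in "learned" module, and the objective-comparison fallback are assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 40, 20
A = rng.normal(size=(m, n))
x_true = np.zeros(n)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.normal(size=m)
lam = 0.1
eta = 1.0 / np.linalg.norm(A, 2) ** 2   # step size from the Lipschitz constant of grad f

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def objective(x):
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def learned_module(x):
    # Hypothetical stand-in for the paper's network-based prior: simple shrinkage.
    return 0.9 * x

x = np.zeros(n)
for _ in range(300):
    # Learning module proposes a point; the optimization module applies a
    # proximal-gradient step to it, falling back to a plain proximal-gradient
    # step whenever the learned proposal does not improve the objective.
    z = learned_module(x)
    z = soft_threshold(z - eta * A.T @ (A @ z - b), eta * lam)
    x_pg = soft_threshold(x - eta * A.T @ (A @ x - b), eta * lam)
    x = z if objective(z) <= objective(x_pg) else x_pg
```

The fallback guarantees monotone objective decrease regardless of the learned module's quality, which mirrors the review's point that proximal-gradient feedback is what supplies the strict convergence guarantee.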
A Dynamic Model of Performative Human-ML Collaboration: Theory and Empirical Evidence
Sühr, Tom, Samadi, Samira, Farronato, Chiara
Machine learning (ML) models are increasingly used in various applications, from recommendation systems in e-commerce to diagnosis prediction in healthcare. In this paper, we present a novel dynamic framework for thinking about the deployment of ML models in a performative, human-ML collaborative system. In our framework, the introduction of ML recommendations changes the data generating process of human decisions, which are only a proxy to the ground truth and which are then used to train future versions of the model. We show that this dynamic process in principle can converge to different stable points, i.e., points where the ML model and the Human+ML system have the same performance. Some of these stable points are suboptimal with respect to the actual ground truth. We conduct an empirical user study with 1,408 participants to showcase this process. In the study, humans solve instances of the knapsack problem with the help of machine learning predictions. This is an ideal setting because we can see how ML models learn to imitate human decisions and how this learning process converges to a stable point. We find that for many levels of ML performance, humans can improve the ML predictions to dynamically reach an equilibrium performance that is around 92% of the maximum knapsack value. We also find that the equilibrium performance could be even higher if humans rationally followed the ML recommendations. Finally, we test whether monetary incentives can increase the quality of human decisions, but we fail to find any positive effect. Our results have practical implications for the deployment of ML models in contexts where human decisions may deviate from the indisputable ground truth.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.46)
- Education > Educational Setting (0.46)
- Information Technology > Services (0.34)
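The convergence-to-a-suboptimal-equilibrium story can be caricatured as a one-dimensional fixed-point iteration. The affine dynamics below are a hypothetical stand-in for the Human+ML retraining loop, not the paper's model, with constants chosen to land near a 92%-of-optimal equilibrium like the one reported:

```python
# Toy fixed-point view of the retraining loop (assumed dynamics):
# the next model is trained on Human+ML decisions, whose quality g(p)
# depends on the current model quality p.
def g(p):
    return 0.60 + 0.35 * p   # hypothetical: human base skill plus partial reliance on ML

p = 0.5                      # initial model performance
for _ in range(100):
    p = g(p)

p_star = 0.60 / (1 - 0.35)   # closed-form fixed point of the affine update (~0.923)
```

Because the human decisions feeding retraining are only a proxy for the ground truth, the fixed point sits strictly below 1.0: the loop stabilizes, but at a suboptimal performance level, which is the qualitative phenomenon the study documents.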
Differential Privacy of Noisy (S)GD under Heavy-Tailed Perturbations
Şimşekli, Umut, Gürbüzbalaban, Mert, Yıldırım, Sinan, Zhu, Lingjiong
Injecting heavy-tailed noise to the iterates of stochastic gradient descent (SGD) has received increasing attention over the past few years. While various theoretical properties of the resulting algorithm have been analyzed mainly from learning theory and optimization perspectives, their privacy preservation properties have not yet been established. Aiming to bridge this gap, we provide differential privacy (DP) guarantees for noisy SGD, when the injected noise follows an $\alpha$-stable distribution, which includes a spectrum of heavy-tailed distributions (with infinite variance) as well as the Gaussian distribution. Considering the $(\epsilon, \delta)$-DP framework, we show that SGD with heavy-tailed perturbations achieves $(0, \tilde{\mathcal{O}}(1/n))$-DP for a broad class of loss functions which can be non-convex, where $n$ is the number of data points. As a remarkable byproduct, contrary to prior work that necessitates bounded sensitivity for the gradients or clipping the iterates, our theory reveals that under mild assumptions, such a projection step is not actually necessary. We illustrate that the heavy-tailed noising mechanism achieves similar DP guarantees compared to the Gaussian case, which suggests that it can be a viable alternative to its light-tailed counterparts.
- North America > United States > New York (0.04)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
- North America > United States > Florida > Leon County > Tallahassee (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)
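A minimal sketch of the noising mechanism: symmetric alpha-stable perturbations, generated with the Chambers-Mallows-Stuck method, injected into SGD on a toy least-squares problem. The problem instance and hyperparameters are assumptions; note that, in line with the abstract, no clipping or projection step is applied.

```python
import numpy as np

rng = np.random.default_rng(0)

def sas_noise(alpha, size):
    """Symmetric alpha-stable samples via the Chambers-Mallows-Stuck method (alpha != 1)."""
    V = rng.uniform(-np.pi / 2, np.pi / 2, size)
    W = rng.exponential(1.0, size)
    return (np.sin(alpha * V) / np.cos(V) ** (1 / alpha)
            * (np.cos((1 - alpha) * V) / W) ** ((1 - alpha) / alpha))

# Noisy SGD on a toy least-squares problem, perturbed with alpha-stable noise.
n, d = 200, 5
X = rng.normal(size=(n, d))
theta_true = np.ones(d)
y = X @ theta_true + 0.1 * rng.normal(size=n)

alpha, lr, sigma = 1.8, 0.01, 0.01   # assumed hyperparameters; alpha < 2 gives infinite variance
theta = np.zeros(d)
for t in range(2000):
    i = rng.integers(n)                      # single-sample stochastic gradient
    grad = (X[i] @ theta - y[i]) * X[i]
    theta = theta - lr * grad + lr * sigma * sas_noise(alpha, d)
```

Setting `alpha = 2.0` in the same generator recovers (scaled) Gaussian noise, so the sketch covers the whole spectrum of distributions the abstract refers to, from light-tailed to heavy-tailed.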